Biostar

11,388 results • Page 2 of 228

I'm wondering about the most straightforward way to extract the interval information contained in a fasta header such as the one below, and ideally to pipe it into a newly created bed file. Thanks! >Mouse|chr12:112380949-112381824

fasta header bed interval

updated 6.3 years ago • rbronste
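One possible approach, sketched in Python. It assumes every header follows the `>label|chrom:start-end` layout of the example above; the function names are illustrative, and the coordinates are copied as written rather than converted to BED's 0-based convention.

```python
import re

# Assumed header layout (from the example in the question): >label|chrom:start-end
HEADER_RE = re.compile(r"^>([^|]*)\|([^:]+):(\d+)-(\d+)\s*$")

def header_to_bed(header):
    """Return a tab-separated BED line (chrom, start, end, name), or None if no match."""
    m = HEADER_RE.match(header)
    if m is None:
        return None
    name, chrom, start, end = m.groups()
    # Coordinates are copied verbatim; subtract 1 from start if the source is 1-based.
    return "\t".join([chrom, start, end, name])

def fasta_to_bed(lines):
    """Yield one BED line for every matching header in an iterable of FASTA lines."""
    for line in lines:
        if line.startswith(">"):
            bed = header_to_bed(line.rstrip("\r\n"))
            if bed is not None:
                yield bed
```

Writing the result to a new file is then a matter of opening the FASTA for reading, opening the BED for writing, and writing each yielded line followed by a newline.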

I am sure that someone will do this work faster and better than me. I would like to edit multiple fasta headers from this format: >M01380:50:000000000-AV1DH:1:1101:16094:3001 1:N:0:M636:16S_V1V3 TTCTGCCT|0|TAGACCTA|0 CS1_534R_YM3_for...3|27| to this one: >M636 As you can see, "M636" is embedded in the main header. Thank you for always helping everybody! D

header edition fasta

updated 6.9 years ago • DVR
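A minimal Python sketch for this kind of header edit. It assumes the sample name is always the fourth colon-separated field of the second whitespace-separated token (`1:N:0:M636:16S_V1V3` in the example); `sample_header` is a hypothetical helper name.

```python
def sample_header(header):
    """Reduce a long read header to '>SAMPLE', e.g. '>M636'.

    Assumes the sample name is field 4 of the second whitespace-separated
    token, as in '1:N:0:M636:16S_V1V3'. Headers that do not fit are returned
    unchanged.
    """
    fields = header.split()
    if len(fields) < 2:
        return header
    parts = fields[1].split(":")
    if len(parts) < 4:
        return header
    return ">" + parts[3]
```

Note that every read from the same sample ends up with the same header, so a running counter may need to be appended if downstream tools require unique IDs.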

I want to extract **gene name** , **gene start position** and **gene stop position** from the fasta header of the fasta file. I have tried to extract based on the position but those locations are not consistent. Is there...and 17th element from this list. It works for this particular example. This does not work for other headers where these positions are different. Usually, gene name is consisten…

fasta R string

updated 3.9 years ago • lokraj2003

Hello! I have a FASTA file and I need a script that reads in the file, changes all the headers to a new format and writes out all the sequences in a new output file. The modified headers should contain, for each sequence, the species name (with "_" r…

FASTA Python script headers

updated 6.4 years ago • mpbiology.dna

Hi all, I would like to rename the headers of my fasta with a list of IDs. For each sequence I have this type of header: >range=chr16:946803-947997 In a separate...txt file I have a list of IDs, in the same order as the sequences, that I would like to use as headers. I guess a simple approach based on bash/awk/sed should work, but I couldn't manage to do so. Cheers

fasta

updated 16 months ago • Bertrand
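The question asks for bash/awk/sed, but the same idea can be sketched in Python. It assumes the ID list has exactly one ID per record, in the same order as the FASTA, and that lines have already been stripped of newlines; `rename_headers` is an illustrative name.

```python
def rename_headers(fasta_lines, new_ids):
    """Replace the n-th header with the n-th ID; sequence lines pass through."""
    ids = iter(new_ids)
    out = []
    for line in fasta_lines:
        if line.startswith(">"):
            try:
                out.append(">" + next(ids))
            except StopIteration:
                # The ID list ran out before the FASTA did.
                raise ValueError("fewer IDs than FASTA records") from None
        else:
            out.append(line)
    return out
```

Because the pairing is purely positional, it is worth checking that the number of IDs equals the number of headers before trusting the output.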

How do I take a specific column from the sequence header identifiers of a fasta file? My headers look like this: ``` >PGM0100236.1 [Candida] scaffold00238 >PGM0100236.1 [Candida...scaffold00241 ``` I would like to keep only the third column, i.e. scaffold00238, for all the headers in my fasta file. Please give a simple command solution. I am new to bioinfo and linux scripting. Thank you

Fasta

updated 19 months ago • palani
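A small Python sketch of the column extraction. It assumes the header columns are separated by whitespace and that the bracketed name contains no spaces, as in `[Candida]`; `third_column` is a hypothetical helper.

```python
def third_column(header):
    """Keep only the third whitespace-separated column of a FASTA header.

    Headers with fewer than three columns are returned unchanged.
    """
    fields = header.lstrip(">").split()
    if len(fields) < 3:
        return header
    return ">" + fields[2]
```

If the bracketed field can contain spaces (for example a two-word species name), taking the last column with `fields[-1]` would be the safer choice.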

Dear all, I want to add a special character "/1" to each fasta header (at the end of the fasta header) in an 8.5 GB fasta file. I used the following command: perl -p -e 's/^(>.*)$/$1-New_Header_info/g' input.fasta

sequencing

updated 9.3 years ago • vahapel
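For comparison with the Perl one-liner, a streaming Python sketch. It processes one line at a time, so the 8.5 GB file never has to fit in memory; `add_suffix` is an illustrative name and the default suffix is the "/1" from the question.

```python
def add_suffix(lines, suffix="/1"):
    """Yield FASTA lines with `suffix` appended to every header line."""
    for line in lines:
        if line.startswith(">"):
            # Drop the line ending, append the suffix, then restore a newline.
            yield line.rstrip("\r\n") + suffix + "\n"
        else:
            yield line
```

With two open file handles, `dst.writelines(add_suffix(src))` writes the modified file without loading it whole.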

Hi Everyone, I have a .fasta file with functional and GO annotations. I also have an associated GFF3 file with the locations of these genes in the...genome. The IDs from the GFF3 and fasta match. I need to append the annotation information from the fasta header to the notes column of the appropriate lines...in the GFF3 file. something like this: Fasta Headers: >evm.model.Scaffold_…

gff annotation fasta genome

updated 3.0 years ago • EJB

Hi, my FASTA headers look like: >1570-13.segment.flu1_PB2 >1570-13.segment.flu2_PB1 >1570-13.segment.flu3_PA etc. Filenames...look like: 201301234.fasta I want to have FASTA headers that look like: >201301234_PB2 >201301234_PB1 >201301234_PA I have seen this answer: https://www.biostars.org

bash

updated 5.0 years ago • SaltedPork
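The question is tagged bash, but the transformation itself can be sketched in Python. It assumes the segment name is whatever follows the last underscore in the header and that the new prefix is the file name without its extension; `header_from_filename` is a hypothetical helper.

```python
from pathlib import Path

def header_from_filename(header, filename):
    """Build '>STEM_SEGMENT' from a header and the name of the file it is in."""
    stem = Path(filename).stem  # '201301234.fasta' -> '201301234'
    # Segment name = text after the last underscore (assumption from the examples).
    segment = header.rstrip().lstrip(">").rsplit("_", 1)[-1]
    return f">{stem}_{segment}"
```

Looping over the files in a directory and applying this to each header line gives the requested renaming.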

In a typical FASTA file, how can the header be used as its filename (i.e., replace the current file name with header ID) ? I have multiple such FASTA

sequence

updated 6.3 years ago • cerulean

I have a fasta file that has more than 2.7 million headers. I want to break it into chunks. >gene1 ACTG... >gene2 ATTT... ... >gene2,700,000...grep -n "^>" my.fasta > headersofmy.fasta This gives me the positional information of the headers. 1:>gene1 4:>gene2 11:>gene3 ... n:&…

bash python fasta

updated 5.7 years ago • sicat.paolo20

My fasta headers of my FASTA file go like this: ``` >M02529:151:000000000-AJBNG:1:1101:20806:3573:133 TGGGGAATTGTTCGCAATGGGCGCAAGCCTGACGACGCAACGCC...The "133" is the sample name, and I need it at the beginning of the header followed by a dot, like this: ``` >:133.M02529:151:000000000-AJBNG:1:1101:20806:3573 TGGGGAATTGTTCGCAATGGGCGCAAGCCTGACGACGCAACGCC...I would be glad to get a 's…

header fasta

updated 24 months ago • fibar

Hi! So I have a FASTA file containing sequences, I want to replace old FASTA headers with new ones, and the first step to do so is to match with...the header names. It's the name I want to match, so the part after the '>'. How do I do this? All sequences have headers somewhat like this...>Halobacterium_salinarum This is the part of the code where I find the headers: while (my $l…

Perl

updated 5.5 years ago • Mimmi Ahlmén

Hi everyone, I've been trying to edit the headers of my fasta file, which I intend to upload to NCBI TSA. I can't seem to successfully upload my file to TSA and, if I'm not mistaken...it could be because of the header format. The headers of my fasta file are as below: >TRINITY_DN1078649_c1_g1_i1 len=235 path=[0:0-234] >TRINITY_DN1078643_c0_g1_i1

NCBI fasta unix TSA RNA

updated 7 months ago • sumitra.20

Hey guys, I have tons of protein multi-fasta files and I would like to append the name of the file to the fasta headers. For example, for an input file one.txt with the...headers >1 ATGC... >2 ATGCAT... I would like to have the output >one_1 ATGC... >one_2 ATGCAT... I use bbrename for DNA sequences, but

sequence

updated 4.1 years ago • genomes_and_MGEs
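A minimal Python sketch of the file-name prefix. It makes no assumption about the alphabet, so protein files are handled the same way as DNA; the underscore separator follows the `>one_1` example, and `prefix_headers` is an illustrative name.

```python
from pathlib import Path

def prefix_headers(lines, filename):
    """Prefix every header with the file name (without extension) and '_'."""
    prefix = Path(filename).stem  # 'one.txt' -> 'one'
    out = []
    for line in lines:
        if line.startswith(">"):
            out.append(">" + prefix + "_" + line[1:])
        else:
            out.append(line)
    return out
```

Applied to many files, the function can be called once per path from a directory listing, writing each result to a new file.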

Hi All, I want to remove empty fasta headers from the fasta file. I used the commands from [this biostar post][1] but they seem to work only for nt sequence files

sequence

updated 2.6 years ago • GP

Hi, I have very little experience with scripts. I want to change my FASTA sequence headers (I have 100's of FASTA sequences per file) from very long headers to headers with the sample name (CM1) and

FASTA Header

updated 6.4 years ago • mollysil

Hi everyone, I have a multi-fasta file named multi.fasta with the following structure: >A 124 B ATCGTA... >C 567 D GTCAG... My goal is to create a new file, with...the new fasta-headers containing only the first and last column. If I use awk -F" " '/>/ {print $1,$(NF)}' multi-fasta > modified_multi-fasta This...will print the fasta headers with …

sequence

updated 22 months ago • genomes_and_MGEs
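A Python sketch of the same idea as the awk command, with one difference: sequence lines are passed through unchanged instead of being filtered out. It assumes whitespace-separated header columns and newline-stripped input lines; `first_last` is a hypothetical name.

```python
def first_last(lines):
    """Yield FASTA lines, reducing each header to its first and last column."""
    for line in lines:
        if line.startswith(">"):
            fields = line.split()
            if len(fields) > 1:
                yield f"{fields[0]} {fields[-1]}"
            else:
                # Single-column header: nothing to drop.
                yield line
        else:
            yield line
```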

I have a concatenated fasta file for a series of genbank entries with different headers. I need to edit the fasta headers to all say "BCH" in place of...the header up to and including the space after "Archilochus alexandri". For example, DQ432746.1 Archilochus alexandri voucher

fasta alignment

updated 4.6 years ago • selplat21

There must be a solution somewhere to my issue, but until now I could not find it. I have a list of fasta headers that I want to use to select a subset of genes from a fasta file that was created using the RAST annotation pipeline...The headers look like this: ``` 160798.5.peg.2 160798.5.peg.12 160798.5.peg.123 160798.5.peg.1234 ``` My problem is that if I use...to do this with sed or awk, but…

grep oneliner fasta

updated 2.4 years ago • thhaverk

I have more than 5000 fasta sequences in a file and want to add a word, for instance "phosphate", to the header of every sequence. Please suggest a Perl solution

fasta

updated 9.0 years ago • Palu

Hi all, I have a fasta file from which I want to extract just the headers of the sequences. Is there any Perl code or something similar to do that? Thanks a lot

perl fasta python parsing

updated 12.4 years ago • Mohammad Reza Bakhtiarizadeh

Hi everybody, I have two fasta files with two kinds of header format. I want to replace sequences of interest in one file based on header name (I have two...lists of headers for the two fasta files in txt format). Could you please advise me what I should do? For example header format for file 1 and

alignment Assembly sequencing RNA-Seq

updated 17 months ago • seta

taxonomic information assigned for each one of these MAGs but for downstream analysis I need the fasta headers to contain the taxonomic information that GTDB-tk assigned. This is how the fasta headers of one of the MAGs look...if there is a way to extract the full taxonomy from the following table and give it to the respective fasta headers of a MAG: ![sample_table][1] So this is the desired ou…

MAGs taxonomy fasta

updated 22 months ago • v.berriosfarias

In a multifasta file the fasta header having full details as follows: ">ENSMUSG0000005892|ENSMUST00000004524351|xclkvsldjldjkfklasdfjalsjk

sequence extraction fasta perl

updated 10.8 years ago • Abdul Rawoof

I would like to filter my fasta file using a regexp on the header. For example, keep only sequences where size != 0 >A1;size=43 ACGTATATATATATATATAT >A1;size

Fasta filter header regexp

updated 7.8 years ago • sacha
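A minimal Python sketch of a regexp-based header filter. It assumes the abundance is written as `;size=N` somewhere in the header, as in the example, and it drops records that carry no size annotation at all; `keep_nonzero` is an illustrative name.

```python
import re

# Assumed annotation format, taken from the example header: ';size=43'
SIZE_RE = re.compile(r";size=(\d+)")

def keep_nonzero(lines):
    """Yield only the records whose header has a size annotation other than 0."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            m = SIZE_RE.search(line)
            # Records without a size annotation are dropped (a design choice).
            keep = m is not None and int(m.group(1)) != 0
        if keep:
            yield line
```

The `keep` flag carries the decision made at the header over to the sequence lines that follow it, which is what makes multi-line records work.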

Aloha! I have a fasta file that looks like the following: >FFSA34B_100_M7_ID10014 ATCTAACAATGTTGCTCATGCAGGCCCTGCAGTAGATTTAACCATTCTATCCCTTCACCTAGCAGGTGTATCCTCCTTAATAGGAGCCATCAATTTTACAACTACTATTGCTAACAGACGTTTAGAAGGTATACCTACAGAAAAAATACCCTTATTTATT...43 And I would like to append the second column of the text file to the matching fasta header to produce the following output: &…

sequence

updated 5.5 years ago • timmers

I need to reformat headers in a fasta file with headers such as: >Agaricus_chiangmaiensis|JF514531|SH174817.07FU|reps|k__Fungi;p__Basidiomycota

next-gen sequencing fasta headers

updated 6.4 years ago • jack1120

Hi everyone; this is my first question on the forum. How can I compare if two fasta files contain the same sequence headers? Does any BioPython module exist for doing this? Thanks in advance, peixe

fasta comparison sequence biopython

updated 12.9 years ago • Peixe
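The comparison does not strictly need a dedicated Biopython module; a plain-Python sketch with sets is shown here. It assumes a record's ID is the first word after `>`; `header_ids` and `compare_headers` are hypothetical helper names.

```python
def header_ids(lines):
    """Collect the first word of every non-empty header line into a set."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids

def compare_headers(lines_a, lines_b):
    """Return (IDs only in A, IDs only in B); both empty means the same headers."""
    ids_a, ids_b = header_ids(lines_a), header_ids(lines_b)
    return ids_a - ids_b, ids_b - ids_a
```

Sets ignore order and duplicates, so this answers "same set of headers" rather than "same headers in the same order".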

seq_file = sys.argv[1] labels = seq_file.split(".") # converting the file from fastq to fasta SeqIO.convert(seq_file,"fastq",labels[0]+".fasta","fasta") # taking the converted file and then changing the fasta header handle...used seq_record.description = "" # this strips the old header out SeqIO.write(seq_record, handle,"fasta") handle.close(…

biopython

updated 8.2 years ago • skbrimer

Hi, I would like to parse a fasta file and get all headers and seqs that match some strings (so called pattern below). What happens is that all the headers...import re # file with FASTA sequence infile = "seq.fa" # File looks…

biopython python fasta regex

updated 6.7 years ago • David

Hey everyone, I have a multi-fasta file like this: >NC_000914 464618..534825 gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac...I would like to remove w…

sequence

updated 2.7 years ago • genomes_and_MGEs

Hi, I am trying to remove the last 5 characters from my FASTA headers in my sequencing data. I have ≈400,000 sequences and have tried to use the sed command in the terminal to do this for me. Input...>1-4 TAGGGAGA How can I use the sed command to remove the last 5 characters from my FASTA headers

FASTA header sed

updated 4.1 years ago • angela1
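The question asks for sed; as an alternative, the trimming logic can be sketched in Python. The number of characters is a parameter, and the leading `>` is never cut; `trim_header` is an illustrative name and the header in the usage note is made up.

```python
def trim_header(header, n=5):
    """Remove the last `n` characters of a header line, keeping at least '>'."""
    body = header.rstrip("\r\n")
    # max(1, ...) protects the leading '>' on very short headers.
    return body[: max(1, len(body) - n)]
```

For a hypothetical header `>1-4_abcd`, the call `trim_header(">1-4_abcd")` returns `>1-4`. Only lines starting with `>` should be passed in; sequence lines must be written out unchanged.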

files: - The first is a tab file. Its first column holds a list of locations and its second column the description of each fasta sequence. - The second is a multi fasta file. Some sequences begin with a normal header and others with...the location in it. I'd like to compare the two files and replace the "LOC" header in the multi fasta with the location and the corresponding description in the tab fi…

RNA-Seq perl

updated 7.0 years ago • Amy

Hi, I have a Fasta file like **GCA_001609185.1_ASM160918v1_genomic.fsa** and I want to change the header of this fasta file like this **>GCA_001609185.1_ASM160918v1_genomic

fasta awk

updated 7.8 years ago • akhilvbioinfo

Hey guys, I have a multi-fasta file containing several extracted regions, such as >NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca...32476 tgcagaagtaagggggtaacaccatgcct... ... I would like to include the strain name in the fasta header, such as >Enterobacter_sp._MGH_6_NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca... >Enterobacter_horm…

genome assembly

updated 5.2 years ago • genomes_and_MGEs

Please help: does anyone know how to extract multiple headers from a fasta file using Perl?

sequence

updated 7.6 years ago • fongsiongshawn

Here is an example of the header from the FASTA file >PSR83604 cdna supercontig:Red5_PS1_1.69.0:ps1sf1427:11608:20559:-1 gene:CEY00_Acc33586 gene_biotype...to keep just the 'description' part which contains the protein name and remove rest of them from the header. I tried using 'sed', but I'm going wrong somewhere. Can someone help? Thank you in advance

fasta FASTA genome cDNA

updated 3.9 years ago • Vignesh

I would like to change the fasta header. For example, I have the following sequences: ``` >NR_130660.1 Hanseniaspora uvarum CBS 314 ITS region; from TYPE material

fasta

updated 14 months ago • fastamasterfromnow

Hi I have around 85 gene sequences in individual fasta files. I'd like to rename each file with their header name containing the gene name in [gene=]. For each header, I only want what...is in-between the brackets. I'm trying to do this through linux commands. in fasta file input ``` >lcl|NC_018552.1_cds_YP_006666009.1_1 [gene=rps12] [locus_tag=C329_pgp044] [db_xref=GeneID:13540299...gb…

Linux fasta

updated 5 months ago • sebabiokr
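The question asks for Linux commands; the extraction step can also be sketched in Python. It assumes each file has a single header containing one `[gene=...]` tag, as in the example; `gene_filename` and the `.fasta` extension are illustrative choices.

```python
import re

# Matches the bracketed gene tag from the example header: [gene=rps12]
GENE_RE = re.compile(r"\[gene=([^\]]+)\]")

def gene_filename(header, ext=".fasta"):
    """Return 'GENE.fasta' built from the [gene=...] tag, or None if absent."""
    m = GENE_RE.search(header)
    if m is None:
        return None
    return m.group(1) + ext
```

The returned name can then be passed to `pathlib.Path.rename` for each input file; checking for name collisions first is advisable, since two files may carry the same gene.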

Hi, I have an annotated fasta file and I want to add, in the first position after >, the word that follows 'Similar to' >_Anouracaudifer_00017283...2\1\2/' file.txt > new_file_2.txt` and store it in a new file and tried to paste it into the headers, but it does not work. Any ideas?

fasta changingheader preprocessing header

updated 12 months ago • Diana Nadia

Hello, I am creating a custom TE library and need fasta file headers to be in a specific format. If I have file 1 with headers like such: >L2-10_EL__1_000087d4-94a9-4af9-a82b...Hopefully that makes sense? I'm worried that this isn't possible due to the odd format of the fasta file headers. Thank you in advance

python fasta bash

updated 3.1 years ago • Ava

Hello! I have a FASTA file and I want to change its headers to a new name. Through searching here on this platform I have found some relevant...but not saved. Also a huge portion of sequences is removed...and I do the first sequences their headers are not named. Any idea what could be the problem? I am using the Linux console. My input sequences > LTR-12 ATTGGAAAACAAACTATCCTACCTTC…

sequence

updated 3.6 years ago • lukhanyomakhabane

Hello I have a .ffn file having 1000 sequences. I wanted to check whether all the fasta headers have sequence underneath them, and it would be great if I also get to know about the fasta headers which do not

sequencing sequence

updated 22 months ago • utkarsh.sood
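A short Python sketch for this check. It treats a header as empty when no non-blank line follows it before the next header or the end of the file; `empty_records` is a hypothetical name.

```python
def empty_records(lines):
    """Return the headers (without '>') that have no sequence underneath."""
    empty = []
    current = None   # header of the record being read
    has_seq = False  # whether any sequence line has been seen for it
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None and not has_seq:
                empty.append(current)
            current, has_seq = line[1:], False
        elif line:
            has_seq = True
    # The last record has no following header to trigger the check.
    if current is not None and not has_seq:
        empty.append(current)
    return empty
```

An empty list as the result means every header in the file has at least one sequence line.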

need from a database using the following bioperl code: use strict; use Bio::SearchIO; use Bio::DB::Fasta; my ($file, $id, $start, $end) = ("secondround_merged_expanded.fasta","C7136661:0-107",1,10); my $db = Bio::DB::Fasta->new($file); my $seq = $db->...seq($id, $start, $end); print $seq,"\n"; Where the header of the sequence I'm trying to extract is: C7136661:0-107, as in the …

bioperl perl fasta

updated 4.4 years ago • jason.r.gallant

Hi I need help writing a command to remove part of a header from my scaffold fasta file. I have headers that look like >scaffold3247|size3454 TTATATAACTAATTAGATAAAATAGCTAATAATAAAAGCTTCTATATAACTAGCCTTCTTTTAATCTATATAATAAGCTTAGCTAATAAAAAGGCCCACT

fasta

updated 18 months ago • kcl58759

Hi, I have thousands of fasta files in one directory. I want to replace all the fasta file headers with the same keyword **>Gast_superba**. Any suggestions?

gene genome

updated 2.1 years ago • sunnykevin97

Hi! I have two files: one is a protein fasta file (`a.fa`) and the other is `header.txt`. I want to get my sequences in the same order as the header file. How can I do this

fasta

updated 5 months ago • Nelo
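One way to do the reordering, sketched in Python. It assumes `header.txt` holds one ID per line (with or without the leading `>`), that an ID is the first word of a header, and that the FASTA fits in memory; `parse_fasta` and `reorder` are illustrative names.

```python
def parse_fasta(lines):
    """Return {record_id: [header_line, seq_line, ...]} keyed by first header word."""
    records = {}
    current = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = [line]
        elif current is not None and line:
            records[current].append(line)
    return records

def reorder(fasta_lines, ordered_ids):
    """Return FASTA lines arranged in the order given by `ordered_ids`."""
    records = parse_fasta(fasta_lines)
    out = []
    for raw in ordered_ids:
        fields = raw.strip().lstrip(">").split()
        if not fields:
            continue  # skip blank lines in the ID list
        rec_id = fields[0]
        # IDs missing from the FASTA are skipped silently in this sketch.
        if rec_id in records:
            out.extend(records[rec_id])
    return out
```

For files too large for memory, an indexed lookup (for example with an indexed FASTA reader) would be a better fit than the dictionary used here.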

Hi everyone, I'm encountering a problem with fasta headers that are too long. They get truncated at the 20th position by a program (TargetP) I'm using. Example: ``` >ConsensusfromContig10000...entries named "ConsensusfromContig1". Is there any software or any script I can use to rename the headers in a way that they are 20 characters long and still able to get identified? I have only found scri…

fasta

updated 16 months ago • branokdrung
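A possible workaround, sketched in Python: instead of truncating, replace each header with a short numbered ID and keep a lookup table to map results back. The `seq` prefix and zero-padded counter are assumptions chosen only to fit the 20-character limit; `short_ids` is a hypothetical name.

```python
def short_ids(headers, width=20):
    """Map each original header to a unique ID that is exactly `width` characters."""
    pad = width - len("seq")  # digits available after the 'seq' prefix
    mapping = {}
    for i, header in enumerate(headers, start=1):
        # e.g. 'seq' followed by the counter, zero-padded to `pad` digits
        mapping[header] = f"seq{i:0{pad}d}"
    return mapping
```

Writing the mapping to a two-column text file alongside the renamed FASTA keeps the original names recoverable after the analysis.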

Hi there, I have fasta files with header @AS500187:87:J5LBGHRXX:2:11101:7742:1046 1:N:0:21 I want to replace `@AS500187:87:J5LBGHRXX:2` with `@AS500187

sed sequence

updated 22 months ago • bnina9999
